Fuzzy paraphrases in learning word representations with a lexicon

Authors

  • Yuanzhi Ke
  • Masafumi Hagiwara
Abstract

We identify a pitfall that previous works using lexicons or ontologies to train or improve distributed word representations have not carefully addressed: for polysemous words and expressions whose meaning changes with context, the paraphrases or related entities listed in a lexicon or an ontology are unreliable and can deteriorate the learning of word representations. We propose an approach to address this problem. We treat each paraphrase of a word in a lexicon not as a full paraphrase but as a fuzzy member (fuzzy paraphrase) of the paraphrase set, whose membership (i.e., degree of truth) depends on the context. We then propose an efficient method to learn word embeddings with these fuzzy paraphrases: we approximately estimate the local membership of each paraphrase and train word embeddings jointly with the lexicon by randomly replacing words in the contexts with their paraphrases, subject to each paraphrase's membership. The experimental results show that our method is efficient, overcomes the weakness of previous related works in extracting semantic information, and outperforms previous works that learn word representations using lexicons.
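
The core training trick described in the abstract, replacing context words with lexicon paraphrases at a rate governed by each paraphrase's estimated membership, can be illustrated with a small sketch. The snippet below is only one plausible reading of that description, not the authors' implementation: the toy lexicon, the fixed per-pair membership scores (the paper estimates membership locally from context), and the names `augment_sentence` and `augment_corpus` are assumptions made for illustration. The augmented sentences would then be fed to an ordinary embedding trainer such as a skip-gram model.

```python
import random

# Toy lexicon: each word maps to candidate paraphrases with a membership
# score in [0, 1]. In the paper the membership is context-dependent; a fixed
# per-pair score stands in for that estimate to keep the sketch self-contained.
LEXICON = {
    "film": [("movie", 0.9), ("picture", 0.4)],
    "bank": [("shore", 0.3), ("lender", 0.5)],  # polysemous: lower scores
}

def augment_sentence(tokens, lexicon, rng=random):
    """Replace each token by one of its fuzzy paraphrases with probability
    equal to that paraphrase's membership; otherwise keep the original word."""
    out = []
    for tok in tokens:
        replaced = False
        for paraphrase, membership in lexicon.get(tok, []):
            if rng.random() < membership:
                out.append(paraphrase)
                replaced = True
                break
        if not replaced:
            out.append(tok)
    return out

def augment_corpus(corpus, lexicon, passes=1, rng=random):
    """Yield paraphrase-augmented copies of the corpus; the result can be fed
    to any word-embedding trainer alongside the original sentences."""
    for _ in range(passes):
        for sent in corpus:
            yield augment_sentence(sent, lexicon, rng)

if __name__ == "__main__":
    corpus = [["the", "film", "was", "long"],
              ["the", "bank", "was", "closed"]]
    for sent in augment_corpus(corpus, LEXICON, passes=2):
        print(sent)
```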

Similar articles

Natural Language Processing With Modular PDP Networks and Distributed Lexicon

An approach to connectionist natural language processing is proposed, which is based on hierarchically organized modular Parallel Distributed Processing (PDP) networks and a central lexicon of distributed input/output representations. The modules communicate using these representations, which are global and publicly available in the system. The representations are developed automatically by all...

Script-Based Inference and Memory Retrieval in

DISCERN is an integrated natural language processing system built entirely from distributed neural networks. It reads short narratives about stereotypical event sequences, stores them in episodic memory, generates fully expanded paraphrases of the narratives, and answers questions about them. Processing in DISCERN is based on hierarchically-organized backpropagation modules, communicating throu...

Joint Word Representation Learning Using a Corpus and a Semantic Lexicon

Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement and word analogy detection. Despite their success, these data-driven word representation learning methods do not consider the rich semantic relational structure betw...

Journal:

Volume   Issue

Pages  -

Publication date: 2016